Mutual Information and Redundancy for Categorical Data


Related articles

K-ANMI: A Mutual Information Based Clustering Algorithm for Categorical Data

Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-ANMI, a new efficient algorithm for clustering categorical data. The k-ANMI algorithm works in a way that is similar to the popular k-means algorithm, and the goodness of the clustering in each step is evaluated using a mutual information based criterion (namely, avera...
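The mutual-information criterion behind ANMI-style clustering compares two categorical labelings of the same items. The following is a minimal sketch of normalized mutual information between two partitions; normalizing by the mean of the two entropies is one common convention (others use the square root or the maximum), and is an assumption here, not necessarily the paper's exact definition.

```python
from collections import Counter
from math import log

def normalized_mutual_information(labels_a, labels_b):
    """NMI between two categorical labelings of the same n items.

    Sketch of the mutual-information criterion used by ANMI-style
    clustering; normalization by the mean entropy is one common
    convention, assumed here for illustration.
    """
    n = len(labels_a)
    ca = Counter(labels_a)                    # cluster sizes in A
    cb = Counter(labels_b)                    # cluster sizes in B
    cab = Counter(zip(labels_a, labels_b))    # joint cell counts

    # I(A; B) = sum p(a,b) log( p(a,b) / (p(a) p(b)) ), in nats
    mi = sum((nab / n) * log((nab * n) / (ca[a] * cb[b]))
             for (a, b), nab in cab.items())
    ha = -sum((c / n) * log(c / n) for c in ca.values())
    hb = -sum((c / n) * log(c / n) for c in cb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both partitions trivial: treat as identical
    return mi / ((ha + hb) / 2)
```

Identical partitions (up to relabeling) score 1.0, while statistically independent partitions score 0.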


A Crash Course on Shannon's Mutual Information for Categorical Data Analysis

Here is a general form of a data set from the field of information retrieval. A corpus of documents (scientific papers, say) contains documents already labeled into topics (e.g. physics, bio, math), a list of keywords, and a count matrix M where Mx,y is the number of appearances (appropriately normalized to correct for differences in document length) of word x in any document of topic y. You can find a fe...
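From such a count matrix, the mutual information between words and topics falls out of the joint distribution obtained by normalizing M. A minimal sketch, with illustrative names not taken from the paper:

```python
import numpy as np

def mutual_information_from_counts(M):
    """Mutual information I(word; topic) in nats from a count matrix M.

    M[x, y] is the (possibly length-normalized) count of word x in
    documents of topic y, as in the corpus example above.
    """
    P = M / M.sum()                     # joint distribution p(x, y)
    px = P.sum(axis=1, keepdims=True)   # marginal p(x) over words
    py = P.sum(axis=0, keepdims=True)   # marginal p(y) over topics
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = P * np.log(P / (px * py))
    return float(np.nansum(terms))      # treat 0 * log 0 as 0
```

A uniform matrix (words independent of topics) gives 0; a diagonal matrix (each word fully determines its topic) gives the full entropy.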


G-ANMI: A mutual information based genetic clustering algorithm for categorical data

Identification of meaningful clusters from categorical data is one key problem in data mining. Recently, Average Normalized Mutual Information (ANMI) has been used to define categorical data clustering as an optimization problem. To find the globally optimal or a near-optimal partition determined by ANMI, a genetic clustering algorithm (G-ANMI) is proposed in this paper. Experimental results show tha...


Feature selection based on mutual information and redundancy-synergy coefficient.

Mutual information is an important information measure for feature subsets. In this paper, a hashing mechanism is proposed to calculate the mutual information of a feature subset. The redundancy-synergy coefficient, a novel measure of the redundancy and synergy of features in expressing the class feature, is defined via mutual information. The information maximization rule is applied to derive the heuristic...


Quantifying multivariate redundancy with maximum entropy decompositions of mutual information

Williams and Beer (2010) proposed a nonnegative mutual information decomposition, based on the construction of redundancy lattices, which allows separating the information that a set of variables contains about a target variable into nonnegative components interpretable as the unique information of some variables not provided by others as well as redundant and synergistic components. However, t...



Journal

Journal title: Communications for Statistical Applications and Methods

Year: 2006

ISSN: 2287-7843

DOI: 10.5351/ckss.2006.13.2.297